138 research outputs found

    Normative thinking on wastewater treatment plants

    This document is the report of the thesis "Normative thinking on wastewater treatment plants". The thesis was born from the author's interest in Artificial Intelligence (A.I.). Having taken all the A.I.-related subjects that the Barcelona School of Informatics (FIB) offers, I asked the teachers of my favorite ones for a thesis related to A.I. Ulises Cortés and Juan Carlos Nieves offered me this interesting thesis, based on a doctoral thesis in environmental sciences by Montse Aulinas [23]. The proposed work involved theoretical research, a working implementation and a real-life domain to work with. I accepted without any doubt. Aulinas's thesis proposed a multi-agent system to manage the problems caused by industrial wastewater discharges into rivers. She argued that using intelligent agents in the wastewater management process could bring an important increase in river water quality and in organizational efficiency. To that end she proposed a group of agents that would take the roles of the most important entities in the wastewater discharge process, from industries to the agencies in charge of controlling them, in order to represent all the parties involved. For the agents to work rationally, they need to interact with the laws they are subject to. That is the main issue this thesis deals with. Based on a real-world domain, this thesis proposes a way to make those laws comprehensible to agents. It discusses a methodology for analyzing, specifying, implementing and testing those laws in a generic way that can be applied to any normative environment.
The goals of this thesis are to obtain a generic and complete specification syntax for analyzing laws and norms, to validate that specification with an implementation of real laws applied to the given domain, and to develop a prototype where the norm implementation can be tested in a plausible real scenario.

    Link prediction in very large directed graphs: Exploiting hierarchical properties in parallel

    Link prediction is a link mining task that tries to find new edges within a given graph. Among the targets of link prediction are large directed graphs, which are frequent structures nowadays. The typical sparsity of large graphs demands high-precision predictions in order to obtain usable results. However, the size of those graphs only permits the execution of scalable algorithms. As a trade-off between those two problems we recently proposed a link prediction algorithm for directed graphs that exploits hierarchical properties. The algorithm can be classified as a local score, which entails scalability. Unlike the rest of local scores, our proposal assumes the existence of an underlying model for the data, which allows it to produce predictions with a higher precision. We test the validity of its hierarchical assumptions on two clearly hierarchical data sets, one of them based on RDF. Then we test it on a non-hierarchical data set based on Wikipedia to demonstrate its broad applicability. Given the computational complexity of link prediction in very large graphs, we also introduce some general recommendations useful for making link prediction an efficiently parallelized problem. Peer Reviewed. Postprint (published version)
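To illustrate what a local score looks like in practice, here is a minimal Python sketch. It does not implement the hierarchical score proposed in the paper, whose definition the abstract does not give. It uses the well-known common-neighbours count instead, applied to a directed edge list while ignoring direction when collecting neighbours. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def build_adjacency(edges):
    """Return successor and predecessor sets for a directed edge list."""
    succ, pred = defaultdict(set), defaultdict(set)
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)
    return succ, pred


def common_neighbours_score(succ, pred, u, v):
    """Count the vertices adjacent to both u and v, ignoring edge direction.

    This is a stand-in local score, not the hierarchical score of the paper.
    """
    neigh_u = (succ[u] | pred[u]) - {v}
    neigh_v = (succ[v] | pred[v]) - {u}
    return len(neigh_u & neigh_v)


def rank_candidates(edges):
    """Score every absent ordered pair and sort by decreasing score."""
    succ, pred = build_adjacency(edges)
    nodes = sorted(set(succ) | set(pred))
    existing = set(edges)
    scored = [
        ((u, v), common_neighbours_score(succ, pred, u, v))
        for u, v in permutations(nodes, 2)
        if (u, v) not in existing
    ]
    # Highest score first; ties broken by the pair itself for determinism.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    toy_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
    for pair, score in rank_candidates(toy_edges):
        print(pair, score)
```

The sketch scores every absent ordered pair, which is only feasible on toy graphs. Because the score of a pair depends only on the neighbourhoods of its two endpoints, a scalable version would restrict candidates to pairs that share at least one neighbour and could evaluate them independently in parallel.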

    Link prediction in large directed graphs

    The first chapter introduces an approach to machine learning (ML) where data is understood as a network of connected entities. This strategy seeks inter-entity information for knowledge discovery, in contrast with traditional intra-entity approaches based on instances and their features. We discuss the importance of this connectivist ML (which we refer to as graph mining) in the current context, where large, topology-based data sets have been made available. The chapter ends by introducing the Link Prediction (LP) problem, together with its current computational and performance limitations. The second chapter discusses early contributions to graph mining, and introduces problems frequently tackled through this paradigm. The chapter then focuses on the state of the art of LP. It presents three different approaches to the problem of finding links in a relational set, and argues for the importance of the most computationally scalable one: similarity-based algorithms. It categorizes similarity-based algorithms into three types of LP scores. For the most scalable type, local similarity-based algorithms, the chapter identifies and formally describes the most competitive proposals according to the bibliography. Chapter three analyses the LP problem, partly as a classic binary classification problem. Graph properties such as directionality, weights and time are discussed in the context of LP. A formal time and space complexity analysis of similarity-based LP scores follows. The chapter ends with a study of the class imbalance found in LP problems. In chapter four a novel similarity-based LP score is introduced. The chapter first elaborates on the importance of hierarchies for representing knowledge through directed graphs. Several modifications to the proposed score are also defined. This chapter also presents a modified version of the most competitive undirected LP scores, adapting them to directed graphs.
The evaluation methodologies of LP are analyzed in the fifth chapter. It starts by discussing the problem of evaluating domains with a huge class imbalance, identifying the most appropriate methodologies for it. A modification of the most appropriate evaluation methodology according to the bibliography is presented, with the goal of focusing on relevant predictions. A discussion on the faithful estimation of the precision of predictors follows. Chapter six describes the graphs used for score evaluation, as well as how the data was transformed into directed graphs. The reasons for choosing these particular domains are given, with special attention to webgraphs and their well-known relation with hierarchies. The most basic properties of each resulting graph are shown. The tests performed are presented in chapter seven. The three most competitive LP scores currently available are tested among themselves, and against a proposed version of those same scores for directed graphs. Our proposed score and its modifications are tested against the scores obtaining the best results in the previous tests. The case of LP in webgraphs is considered separately, testing six different webgraphs. The chapter ends with a discussion on the limitations of this formal analysis, showing examples of predictions obtained. Chapter eight covers the computational aspects of the work done. It starts with a discussion on the importance of memory management for determining the computational cost of LP algorithms. A proposal on how to reduce this cost through precision reduction is presented. A section focused on the parallelization of code follows, which includes two different implementations, one on a graph-specific programming model (Pregel) and one on a generic programming model (OpenMP). The chapter ends with a specification of the computational resources used for the tests. The conclusions of this thesis proposal are presented in chapter nine.
Chapter ten contains several future lines of work.
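The class imbalance mentioned in this abstract can be made concrete with a little arithmetic. The Python sketch below is a rough illustration, assuming a simple directed graph without self-loops in which every ordered pair of vertices is one instance of the binary classification problem. The function name and the graph sizes in the usage example are hypothetical, not figures from the thesis.

```python
def class_imbalance(num_nodes, num_edges):
    """Return (positives, negatives, negatives per positive).

    Assumes a simple directed graph without self-loops, so the number of
    possible edges is the number of ordered vertex pairs.
    """
    if num_edges <= 0:
        raise ValueError("the graph needs at least one edge")
    possible = num_nodes * (num_nodes - 1)
    negatives = possible - num_edges
    return num_edges, negatives, negatives / num_edges


if __name__ == "__main__":
    # Hypothetical sparse graph: one million vertices, ten million edges.
    positives, negatives, ratio = class_imbalance(10**6, 10**7)
    print(positives, negatives, round(ratio))
```

For the hypothetical graph in the usage example there are roughly 100,000 absent pairs per existing edge, which is why evaluation measures that reward correct negatives say little about the usefulness of a predictor.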

    Hierarchical inference applied to Cyc

    Hierarchical graphs are a frequent solution for capturing symbolic data due to the importance of hierarchies for defining knowledge. In these graphs, relations among elements may contain large portions of the elements' semantics. However, knowledge discovery based on analyzing the patterns of hierarchical relations is rarely used. We outline four inference-based algorithms that exploit semantic properties of hierarchically represented knowledge to produce new links, and test one of them on a generalization of Cyc's KB. Finally, we argue why such algorithms can be useful for unsupervised learning and supervised analysis of a KB. Peer Reviewed. Postprint (author's final draft)

    Bringing action language C+ to normative contexts: preliminary report

    C+ is an action language for specifying and reasoning about the effects of actions and the persistence of facts over time. Based on it, we present CN+, an operationally enhanced form of C+ designed for representing complex normative systems and integrating them easily into the semantics of the causal theory of actions. The proposed system contains a particular formalization of norms using a life-cycle approach to capture the whole normative meaning of a complex normative framework. We discuss this approach and illustrate it with examples. Peer Reviewed. Postprint (author's final draft)

    Focus! rating XAI methods and finding biases

    Explainability has become a major topic of research in Artificial Intelligence (AI), aimed at increasing trust in models such as Deep Learning (DL) networks. However, trustworthy models cannot be achieved with explainable AI (XAI) methods unless the XAI methods themselves can be trusted. To evaluate XAI methods one may assess interpretability, a qualitative measure of how understandable an explanation is to humans [1]. While this is important to guarantee the proper interaction between humans and the model, interpretability generally involves end-users in the process [2], inducing strong biases. In fact, a qualitative evaluation alone cannot guarantee coherence with reality (i.e., model behavior), as false explanations can be more interpretable than accurate ones. To enable trust in XAI methods, we also need quantitative and objective evaluation metrics, which validate the relation between the explanations produced by the XAI method and the behavior of the trained model under assessment. In this work we propose a novel evaluation score for feature attribution methods, described in §I-A. Our input alteration approach induces in-distribution noise into samples, that is, alterations of the input which correspond to visual patterns found within the original data distribution. To do so we modify the context of the sample instead of the content, leaving the original pixel values untouched. In practice, we create a new sample, composed of samples of different classes, which we call a mosaic image (see examples in Figure 2). Using mosaics as input has a major benefit: each input quadrant is an image from the original distribution, producing blobs of activations in each quadrant which are consequently coherent. Only the pixels forming the borders between images, and the few corresponding activations, may be considered out of distribution.
By inducing in-distribution noise, mosaic images introduce a problem in which XAI methods may objectively err (that is, focus on something they should not be focusing on). On those composed mosaics we ask an XAI method to provide an explanation for just one of the contained classes, and examine its response. Then, we measure how much of the explanation generated by the XAI method is located on the areas corresponding to the target class, quantifying it through the Focus score. This score allows us to compare methods in terms of explanation precision, evaluating the capability of XAI methods to provide explanations related to the requested class. Using mosaics has another benefit. Since the noise introduced is in-distribution, the explanation errors identify and exemplify biases of the model. This facilitates the elimination of biases in models and datasets, potentially resulting in more reliable solutions. We illustrate how to do so in §I-C.
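The measurement described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the mosaic is taken to be a 2x2 grid of equally sized images, the attribution map is a plain list of rows, and only positive attribution is counted. The function names are hypothetical.

```python
def quadrant_mask(height, width, target_quadrants):
    """Boolean mask over a 2x2 mosaic (assumed layout).

    Quadrants are numbered 0=top-left, 1=top-right, 2=bottom-left,
    3=bottom-right; cells in the target quadrants are True.
    """
    half_h, half_w = height // 2, width // 2
    targets = set(target_quadrants)
    mask = []
    for r in range(height):
        row_index = 1 if r >= half_h else 0
        mask_row = []
        for c in range(width):
            col_index = 1 if c >= half_w else 0
            mask_row.append(2 * row_index + col_index in targets)
        mask.append(mask_row)
    return mask


def focus_score(attribution, target_mask):
    """Fraction of positive attribution that lies on target-class areas.

    Counting only positive values is an assumption of this sketch.
    """
    inside = 0.0
    total = 0.0
    for attr_row, mask_row in zip(attribution, target_mask):
        for value, is_target in zip(attr_row, mask_row):
            if value > 0:
                total += value
                if is_target:
                    inside += value
    return inside / total if total > 0 else 0.0


if __name__ == "__main__":
    # Toy 4x4 attribution map; the target class occupies the top-left quadrant.
    attribution = [
        [3, 3, -5, -5],
        [3, 3, -5, -5],
        [0, 0, 1, 1],
        [0, 0, 1, 1],
    ]
    mask = quadrant_mask(4, 4, [0])
    print(focus_score(attribution, mask))
```

A value of 1 would mean that all positive attribution lies on images of the requested class, while lower values indicate that part of the explanation falls on images of other classes.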